Downloading the data¶

In [1]:
!gdown 1U89A4c4oGSx5w5-XC-Ump--O7CphcL7M
Downloading...
From: https://drive.google.com/uc?id=1U89A4c4oGSx5w5-XC-Ump--O7CphcL7M
To: /content/eclipse_data_enriched_5000_years.csv
100% 5.21M/5.21M [00:00<00:00, 83.1MB/s]

Exploring the data¶

In [2]:
import pandas as pd
In [3]:
data = pd.read_csv("https://eclipse.gsfc.nasa.gov/eclipse_besselian_from_mysqldump2.csv")
data.head()
Out[3]:
year month day td_ge dt luna_num saros eclipse_type gamma magnitude ... tan_f2 tmin tmax etype PNS UNS NCN nSer nSeq nJLE
0 -1999 6 12 03:14:51 46438.2 -49456 5 T -0.27009 1.07329 ... 0.004578 -3.0 3.0 1 0 0 0 73 41 4
1 -1999 12 5 23:45:23 46426.5 -49450 10 A -0.23172 0.93818 ... 0.004732 -3.0 3.0 2 0 0 0 73 27 40
2 -1998 6 1 18:09:16 46414.6 -49444 15 T 0.49936 1.02844 ... 0.004573 -3.0 3.0 1 1 0 0 75 32 20
3 -1998 11 25 05:57:03 46402.8 -49438 20 A -0.90454 0.98056 ... 0.004737 -3.0 3.0 2 -1 0 0 72 17 20
4 -1997 4 22 13:19:56 46392.9 -49433 -13 P -1.46705 0.16108 ... 0.004576 -3.0 3.0 4 -1 -1 -1 73 72 1

5 rows × 54 columns

In [4]:
data_kaggle = pd.read_csv('/content/eclipse_data_enriched_5000_years.csv')
data_kaggle.head()
Out[4]:
Catalog Number Calendar Date Eclipse Time Delta T (s) Lunation Number Saros Number Eclipse Type Gamma Eclipse Magnitude Latitude ... EII Year Modulus HEAS Decade Localized ESC ESC Moving Average ESC Wide-Scale Moving Average Eclipse Interval Cluster Cluster 6
0 1 -1999 June 12 03:14:51 46438 -49456 5 T -0.2701 0.992601 6.0N ... 0.662068 1999 0.333667 -2000 1.556657 NaN NaN 0.333333 0 1
1 2 -1999 December 5 23:45:23 46426 -49450 10 A -0.2317 0.867659 32.9S ... 0.608567 1999 0.333667 -2000 1.556657 NaN NaN 0.500000 0 1
2 3 -1998 June 1 18:09:16 46415 -49444 15 T 0.4994 0.951077 46.2N ... 0.498677 1998 0.334000 -2000 1.792195 NaN NaN 0.500000 0 1
3 4 -1998 November 25 05:57:03 46403 -49438 20 A -0.9045 0.906871 67.8S ... 0.389974 1998 0.334000 -2000 1.792195 NaN NaN 0.500000 0 1
4 5 -1997 April 22 13:19:56 46393 -49433 -13 P -1.4670 0.148987 60.6S ... NaN 1997 0.334333 -2000 2.004286 NaN NaN 0.472222 1 0

5 rows × 47 columns

In [5]:
data.columns
Out[5]:
Index(['year', 'month', 'day', 'td_ge', 'dt', 'luna_num', 'saros',
       'eclipse_type', 'gamma', 'magnitude', 'lat_ge', 'lng_ge', 'lat_dd_ge',
       'lng_dd_ge', 'sun_alt', 'sun_azm', 'path_width', 'central_duration',
       'duration_secs', 'cat_no', 'canon_plate', 'julian_date', 't0', 'x0',
       'x1', 'x2', 'x3', 'y0', 'y1', 'y2', 'y3', 'd0', 'd1', 'd2', 'mu0',
       'mu1', 'mu2', 'l10', 'l11', 'l12', 'l20', 'l21', 'l22', 'tan_f1',
       'tan_f2', 'tmin', 'tmax', 'etype', 'PNS', 'UNS', 'NCN', 'nSer', 'nSeq',
       'nJLE'],
      dtype='object')
In [ ]:
data_kaggle.columns
Out[ ]:
Index(['Catalog Number', 'Calendar Date', 'Eclipse Time', 'Delta T (s)',
       'Lunation Number', 'Saros Number', 'Eclipse Type', 'Gamma',
       'Eclipse Magnitude', 'Latitude', 'Longitude', 'Sun Altitude',
       'Sun Azimuth', 'Path Width (km)', 'Central Duration', 'Date Time',
       'Year', 'Month', 'Day', 'Visibility', 'Eclipse Latitude',
       'Eclipse Longitude', 'obliquity', 'Geographical Hemisphere',
       'Daytime/Nighttime', 'Sun Constellation', 'Inter-Eclipse Duration',
       'Visibility Score', 'Eclipse Classification', 'Duration in Seconds',
       'Moon Distance (km)', 'Sun Distance (km)',
       'Moon Angular Diameter (degrees)', 'Sun Angular Diameter (degrees)',
       'Central Duration Seconds', 'Normalized Duration',
       'Normalized Path Width', 'EII', 'Year Modulus', 'HEAS', 'Decade',
       'Localized ESC', 'ESC Moving Average', 'ESC Wide-Scale Moving Average',
       'Eclipse Interval', 'Cluster', 'Cluster 6'],
      dtype='object')
In [6]:
data.dtypes
Out[6]:
year                  int64
month                 int64
day                   int64
td_ge                object
dt                  float64
luna_num              int64
saros                 int64
eclipse_type         object
gamma               float64
magnitude           float64
lat_ge               object
lng_ge               object
lat_dd_ge           float64
lng_dd_ge           float64
sun_alt             float64
sun_azm             float64
path_width          float64
central_duration     object
duration_secs       float64
cat_no              float64
canon_plate         float64
julian_date         float64
t0                  float64
x0                  float64
x1                  float64
x2                  float64
x3                  float64
y0                  float64
y1                  float64
y2                  float64
y3                  float64
d0                  float64
d1                  float64
d2                  float64
mu0                 float64
mu1                 float64
mu2                 float64
l10                 float64
l11                 float64
l12                 float64
l20                 float64
l21                 float64
l22                 float64
tan_f1              float64
tan_f2              float64
tmin                float64
tmax                float64
etype                 int64
PNS                   int64
UNS                   int64
NCN                   int64
nSer                  int64
nSeq                  int64
nJLE                  int64
dtype: object
In [7]:
data_kaggle.dtypes
Out[7]:
Catalog Number                       int64
Calendar Date                       object
Eclipse Time                        object
Delta T (s)                          int64
Lunation Number                      int64
Saros Number                         int64
Eclipse Type                        object
Gamma                              float64
Eclipse Magnitude                  float64
Latitude                            object
Longitude                           object
Sun Altitude                         int64
Sun Azimuth                          int64
Path Width (km)                     object
Central Duration                    object
Date Time                           object
Year                                 int64
Month                                int64
Day                                  int64
Visibility                          object
Eclipse Latitude                   float64
Eclipse Longitude                  float64
obliquity                          float64
Geographical Hemisphere             object
Daytime/Nighttime                   object
Sun Constellation                   object
Inter-Eclipse Duration               int64
Visibility Score                   float64
Eclipse Classification              object
Duration in Seconds                float64
Moon Distance (km)                 float64
Sun Distance (km)                  float64
Moon Angular Diameter (degrees)    float64
Sun Angular Diameter (degrees)     float64
Central Duration Seconds           float64
Normalized Duration                float64
Normalized Path Width              float64
EII                                float64
Year Modulus                         int64
HEAS                               float64
Decade                               int64
Localized ESC                      float64
ESC Moving Average                 float64
ESC Wide-Scale Moving Average      float64
Eclipse Interval                   float64
Cluster                              int64
Cluster 6                            int64
dtype: object
In [8]:
data.shape
Out[8]:
(11898, 54)
In [9]:
data_kaggle.shape
Out[9]:
(11898, 47)
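
Both frames report 11898 rows, and the first rows shown above share the same years and lunation numbers. Later cells copy columns from one frame into the other by row position, so it may be worth confirming that the rows really line up. The following is a minimal sketch on toy data; the helper name `rows_aligned` and the choice of key columns are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd

# Toy stand-ins for the two catalogs; in the notebook these would be `data` and `data_kaggle`.
nasa = pd.DataFrame({"year": [-1999, -1999, -1998],
                     "luna_num": [-49456, -49450, -49444]})
kaggle = pd.DataFrame({"Year": [-1999, -1999, -1998],
                       "Lunation Number": [-49456, -49450, -49444]})

def rows_aligned(left, right, key_pairs):
    """Return True when every (left_col, right_col) pair matches row by row."""
    if len(left) != len(right):
        return False
    return all(
        (left[l_col].to_numpy() == right[r_col].to_numpy()).all()
        for l_col, r_col in key_pairs
    )

pairs = [("year", "Year"), ("luna_num", "Lunation Number")]
aligned = rows_aligned(nasa, kaggle, pairs)
print(aligned)
```

If the check fails on the real frames, merging on the key columns with `DataFrame.merge` would be safer than positional assignment.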
In [10]:
data.isnull().sum()
Out[10]:
year                0
month               0
day                 0
td_ge               0
dt                  0
luna_num            0
saros               0
eclipse_type        0
gamma               0
magnitude           0
lat_ge              0
lng_ge              0
lat_dd_ge           0
lng_dd_ge           0
sun_alt             0
sun_azm             0
path_width          0
central_duration    0
duration_secs       0
cat_no              0
canon_plate         0
julian_date         0
t0                  0
x0                  0
x1                  0
x2                  0
x3                  0
y0                  0
y1                  0
y2                  0
y3                  0
d0                  0
d1                  0
d2                  0
mu0                 0
mu1                 0
mu2                 0
l10                 0
l11                 0
l12                 0
l20                 0
l21                 0
l22                 0
tan_f1              0
tan_f2              0
tmin                0
tmax                0
etype               0
PNS                 0
UNS                 0
NCN                 0
nSer                0
nSeq                0
nJLE                0
dtype: int64
In [11]:
data_kaggle.isnull().sum()
Out[11]:
Catalog Number                         0
Calendar Date                          0
Eclipse Time                           0
Delta T (s)                            0
Lunation Number                        0
Saros Number                           0
Eclipse Type                           0
Gamma                                  0
Eclipse Magnitude                      0
Latitude                               0
Longitude                              0
Sun Altitude                           0
Sun Azimuth                            0
Path Width (km)                     4200
Central Duration                    4200
Date Time                              0
Year                                   0
Month                                  0
Day                                    0
Visibility                             0
Eclipse Latitude                       0
Eclipse Longitude                      0
obliquity                              0
Geographical Hemisphere                0
Daytime/Nighttime                      0
Sun Constellation                      0
Inter-Eclipse Duration                 0
Visibility Score                       0
Eclipse Classification                 0
Duration in Seconds                11898
Moon Distance (km)                     0
Sun Distance (km)                      0
Moon Angular Diameter (degrees)        0
Sun Angular Diameter (degrees)         0
Central Duration Seconds            4294
Normalized Duration                 4294
Normalized Path Width               4381
EII                                 4381
Year Modulus                           0
HEAS                                   0
Decade                                 0
Localized ESC                          0
ESC Moving Average                     9
ESC Wide-Scale Moving Average        804
Eclipse Interval                       0
Cluster                                0
Cluster 6                              0
dtype: int64
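
The full listing above is long, yet only a few columns have gaps ('Duration in Seconds' is empty in all 11898 rows, and 'Path Width (km)' and 'Central Duration' are empty in 4200). A minimal sketch, on toy data, of a report limited to the affected columns; `missing_report` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

# Toy frame with a few gaps; the values are made up for illustration.
toy = pd.DataFrame({
    "Gamma": [-0.27, -0.23, 0.50, -0.90],
    "Path Width (km)": ["247", "236", None, None],
    "EII": [0.66, 0.61, np.nan, 0.39],
})

def missing_report(frame):
    """Return the count and share of missing values for columns that have any."""
    counts = frame.isnull().sum()
    counts = counts[counts > 0].sort_values(ascending=False)
    return pd.DataFrame({"missing": counts, "share": counts / len(frame)})

report = missing_report(toy)
print(report)
```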

Cleaning and merging the data¶

In [12]:
def txt_to_seconds(text_value):
  """Convert a fixed-width duration string to seconds (chars 0-1: minutes, chars 3-4: seconds)."""
  minutes = int(text_value[:2])
  seconds = int(text_value[3:5])

  return (60 * minutes) + seconds
In [13]:
def month_num_to_text(num_value):
  """Map a month number (1-12) to its English name."""
  month = {
      1: "January",
      2: "February",
      3: "March",
      4: "April",
      5: "May",
      6: "June",
      7: "July",
      8: "August",
      9: "September",
      10: "October",
      11: "November",
      12: "December"
  }

  return month[num_value]
In [14]:
# Overwrite the incomplete data_kaggle columns with the values from data.
# Assignment aligns on the row index, so both frames must list the same eclipses in the same order.
data_kaggle['Path Width (km)'] = data['path_width']
data_kaggle['Central Duration'] = data['central_duration']
data_kaggle['Central Duration Seconds'] = data['central_duration'].apply(txt_to_seconds)
# Replace month numbers with ordered month names so plots and sorts follow calendar order.
data_kaggle['Month'] = data_kaggle['Month'].apply(month_num_to_text)
data_kaggle['Month'] = pd.Categorical(data_kaggle['Month'],categories=['January','February','March','April','May','June','July','August','September','October','November','December'],ordered=True)
data_kaggle['Duration in Seconds'] = data['duration_secs']
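
The duration helper above reads minutes and seconds from fixed character positions. As a hedged alternative, the sketch below parses by pattern with `Series.str.extract`, which avoids a Python-level `apply`. The sample strings and the `MMmSSs` layout are assumptions; the real `central_duration` values should be inspected before relying on this.

```python
import pandas as pd

# Made-up sample values in an assumed "MMmSSs" layout.
durations = pd.Series(["06m37s", "00m00s", "11m08s"])

# Capture minutes and seconds by pattern instead of by character position.
parts = durations.str.extract(r"(?P<minutes>\d+)m(?P<seconds>\d+)s")
total_seconds = parts["minutes"].astype(int) * 60 + parts["seconds"].astype(int)
print(total_seconds.tolist())
```

Rows that do not match the pattern come back as NaN from `str.extract`, so the `astype(int)` step would raise on them; that makes malformed values visible instead of silently mis-sliced.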
In [15]:
data_final = data_kaggle.drop(columns=['Catalog Number','Calendar Date','Eclipse Time','Date Time','Latitude','Longitude','Central Duration','Duration in Seconds','Cluster', 'Cluster 6'])
In [16]:
data_final.head()
Out[16]:
Delta T (s) Lunation Number Saros Number Eclipse Type Gamma Eclipse Magnitude Sun Altitude Sun Azimuth Path Width (km) Year ... Normalized Duration Normalized Path Width EII Year Modulus HEAS Decade Localized ESC ESC Moving Average ESC Wide-Scale Moving Average Eclipse Interval
0 46438 -49456 5 T -0.2701 0.992601 74 344 246.6 -1999 ... 0.534320 0.174066 0.662068 1999 0.333667 -2000 1.556657 NaN NaN 0.333333
1 46426 -49450 10 A -0.2317 0.867659 76 21 235.9 -1999 ... 0.543742 0.166314 0.608567 1999 0.333667 -2000 1.556657 NaN NaN 0.500000
2 46415 -49444 15 T 0.4994 0.951077 60 151 110.8 -1998 ... 0.181696 0.078224 0.498677 1998 0.334000 -2000 1.792195 NaN NaN 0.500000
3 46403 -49438 20 A -0.9045 0.906871 25 74 162.4 -1998 ... 0.099596 0.114165 0.389974 1998 0.334000 -2000 1.792195 NaN NaN 0.500000
4 46393 -49433 -13 P -1.4670 0.148987 0 281 0.0 -1997 ... NaN NaN NaN 1997 0.334333 -2000 2.004286 NaN NaN 0.472222

5 rows × 37 columns

In [17]:
data_final.isnull().sum()
Out[17]:
Delta T (s)                           0
Lunation Number                       0
Saros Number                          0
Eclipse Type                          0
Gamma                                 0
Eclipse Magnitude                     0
Sun Altitude                          0
Sun Azimuth                           0
Path Width (km)                       0
Year                                  0
Month                                 0
Day                                   0
Visibility                            0
Eclipse Latitude                      0
Eclipse Longitude                     0
obliquity                             0
Geographical Hemisphere               0
Daytime/Nighttime                     0
Sun Constellation                     0
Inter-Eclipse Duration                0
Visibility Score                      0
Eclipse Classification                0
Moon Distance (km)                    0
Sun Distance (km)                     0
Moon Angular Diameter (degrees)       0
Sun Angular Diameter (degrees)        0
Central Duration Seconds              0
Normalized Duration                4294
Normalized Path Width              4381
EII                                4381
Year Modulus                          0
HEAS                                  0
Decade                                0
Localized ESC                         0
ESC Moving Average                    9
ESC Wide-Scale Moving Average       804
Eclipse Interval                      0
dtype: int64
In [18]:
numeric_columns_with_nans = ['Normalized Duration', 'Normalized Path Width', 'EII', 'ESC Wide-Scale Moving Average', 'ESC Moving Average']
In [19]:
# Fill missing values with each column's median.
# The median is used instead of the mean so that outliers do not skew the imputed values.
for column in numeric_columns_with_nans:
  median_value = data_final[column].median()
  data_final[column] = data_final[column].fillna(median_value)
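
To illustrate why the median is the more robust choice here, the sketch below fills the same gap with the mean and with the median. The series is made up, with one deliberately extreme value.

```python
import numpy as np
import pandas as pd

# Made-up values; 5.00 plays the role of an outlier and the last entry is missing.
values = pd.Series([0.10, 0.12, 0.11, 0.13, 5.00, np.nan])

mean_fill = values.fillna(values.mean())
median_fill = values.fillna(values.median())

print(mean_fill.iloc[-1], median_fill.iloc[-1])
```

The mean-based fill is pulled toward the outlier, while the median-based fill stays close to the typical values.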
In [20]:
data_final[numeric_columns_with_nans].isnull().sum()
Out[20]:
Normalized Duration              0
Normalized Path Width            0
EII                              0
ESC Wide-Scale Moving Average    0
ESC Moving Average               0
dtype: int64
In [21]:
data_final.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11898 entries, 0 to 11897
Data columns (total 37 columns):
 #   Column                           Non-Null Count  Dtype   
---  ------                           --------------  -----   
 0   Delta T (s)                      11898 non-null  int64   
 1   Lunation Number                  11898 non-null  int64   
 2   Saros Number                     11898 non-null  int64   
 3   Eclipse Type                     11898 non-null  object  
 4   Gamma                            11898 non-null  float64 
 5   Eclipse Magnitude                11898 non-null  float64 
 6   Sun Altitude                     11898 non-null  int64   
 7   Sun Azimuth                      11898 non-null  int64   
 8   Path Width (km)                  11898 non-null  float64 
 9   Year                             11898 non-null  int64   
 10  Month                            11898 non-null  category
 11  Day                              11898 non-null  int64   
 12  Visibility                       11898 non-null  object  
 13  Eclipse Latitude                 11898 non-null  float64 
 14  Eclipse Longitude                11898 non-null  float64 
 15  obliquity                        11898 non-null  float64 
 16  Geographical Hemisphere          11898 non-null  object  
 17  Daytime/Nighttime                11898 non-null  object  
 18  Sun Constellation                11898 non-null  object  
 19  Inter-Eclipse Duration           11898 non-null  int64   
 20  Visibility Score                 11898 non-null  float64 
 21  Eclipse Classification           11898 non-null  object  
 22  Moon Distance (km)               11898 non-null  float64 
 23  Sun Distance (km)                11898 non-null  float64 
 24  Moon Angular Diameter (degrees)  11898 non-null  float64 
 25  Sun Angular Diameter (degrees)   11898 non-null  float64 
 26  Central Duration Seconds         11898 non-null  int64   
 27  Normalized Duration              11898 non-null  float64 
 28  Normalized Path Width            11898 non-null  float64 
 29  EII                              11898 non-null  float64 
 30  Year Modulus                     11898 non-null  int64   
 31  HEAS                             11898 non-null  float64 
 32  Decade                           11898 non-null  int64   
 33  Localized ESC                    11898 non-null  float64 
 34  ESC Moving Average               11898 non-null  float64 
 35  ESC Wide-Scale Moving Average    11898 non-null  float64 
 36  Eclipse Interval                 11898 non-null  float64 
dtypes: category(1), float64(19), int64(11), object(6)
memory usage: 3.3+ MB
In [22]:
data_final.head()
Out[22]:
Delta T (s) Lunation Number Saros Number Eclipse Type Gamma Eclipse Magnitude Sun Altitude Sun Azimuth Path Width (km) Year ... Normalized Duration Normalized Path Width EII Year Modulus HEAS Decade Localized ESC ESC Moving Average ESC Wide-Scale Moving Average Eclipse Interval
0 46438 -49456 5 T -0.2701 0.992601 74 344 246.6 -1999 ... 0.534320 0.174066 0.662068 1999 0.333667 -2000 1.556657 1.955336 1.950866 0.333333
1 46426 -49450 10 A -0.2317 0.867659 76 21 235.9 -1999 ... 0.543742 0.166314 0.608567 1999 0.333667 -2000 1.556657 1.955336 1.950866 0.500000
2 46415 -49444 15 T 0.4994 0.951077 60 151 110.8 -1998 ... 0.181696 0.078224 0.498677 1998 0.334000 -2000 1.792195 1.955336 1.950866 0.500000
3 46403 -49438 20 A -0.9045 0.906871 25 74 162.4 -1998 ... 0.099596 0.114165 0.389974 1998 0.334000 -2000 1.792195 1.955336 1.950866 0.500000
4 46393 -49433 -13 P -1.4670 0.148987 0 281 0.0 -1997 ... 0.309556 0.131078 0.532898 1997 0.334333 -2000 2.004286 1.955336 1.950866 0.472222

5 rows × 37 columns

In [23]:
data_final.columns
Out[23]:
Index(['Delta T (s)', 'Lunation Number', 'Saros Number', 'Eclipse Type',
       'Gamma', 'Eclipse Magnitude', 'Sun Altitude', 'Sun Azimuth',
       'Path Width (km)', 'Year', 'Month', 'Day', 'Visibility',
       'Eclipse Latitude', 'Eclipse Longitude', 'obliquity',
       'Geographical Hemisphere', 'Daytime/Nighttime', 'Sun Constellation',
       'Inter-Eclipse Duration', 'Visibility Score', 'Eclipse Classification',
       'Moon Distance (km)', 'Sun Distance (km)',
       'Moon Angular Diameter (degrees)', 'Sun Angular Diameter (degrees)',
       'Central Duration Seconds', 'Normalized Duration',
       'Normalized Path Width', 'EII', 'Year Modulus', 'HEAS', 'Decade',
       'Localized ESC', 'ESC Moving Average', 'ESC Wide-Scale Moving Average',
       'Eclipse Interval'],
      dtype='object')

Exporting the cleaned data¶

In [24]:
data_final.to_csv('data.csv', sep=',')
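
By default `to_csv` also writes the row index, which comes back as an extra unnamed column when the file is read again. A small sketch of the difference, using an in-memory buffer and a toy frame instead of the real file:

```python
import io
import pandas as pd

frame = pd.DataFrame({"Year": [-1999, -1998], "Gamma": [-0.2701, 0.4994]})

with_index = io.StringIO()
frame.to_csv(with_index)                   # default: index written as the first column
without_index = io.StringIO()
frame.to_csv(without_index, index=False)   # index omitted

reloaded_with = pd.read_csv(io.StringIO(with_index.getvalue()))
reloaded_without = pd.read_csv(io.StringIO(without_index.getvalue()))
print(list(reloaded_with.columns))
print(list(reloaded_without.columns))
```

Whether to pass `index=False` depends on whether downstream code expects the positional index as a column.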

Plots¶

In [25]:
import matplotlib.pyplot as plt
import plotly.express as px
import seaborn as sns
In [26]:
# Correlation Analysis
eclipse_features = data_final[['Delta T (s)', 'Lunation Number', 'Saros Number',
                              'Gamma', 'Eclipse Magnitude', 'Sun Altitude',
                              'Sun Azimuth', 'Path Width (km)', 'Year',
                              'Day', 'Eclipse Latitude', 'Eclipse Longitude',
                              'obliquity', 'Inter-Eclipse Duration', 'Visibility Score',
                              'Moon Distance (km)', 'Sun Distance (km)',
                              'Moon Angular Diameter (degrees)', 'Sun Angular Diameter (degrees)',
                              'Central Duration Seconds', 'Normalized Duration',
                              'Normalized Path Width', 'EII', 'Year Modulus',
                              'HEAS', 'Decade', 'Localized ESC', 'ESC Moving Average',
                              'ESC Wide-Scale Moving Average', 'Eclipse Interval']]
correlation_matrix = eclipse_features.corr()
plt.figure(figsize=(25, 25))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('Correlation Matrix of Eclipse Features')
plt.show()

# This heatmap provides insights into how different eclipse parameters are interrelated. For instance, it can show whether the magnitude of an eclipse is related to its geographic latitude.
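
A 30 by 30 annotated heatmap can be hard to read. One possible complement, sketched here on synthetic data, is to list the most strongly correlated column pairs directly. `strongest_pairs` is a hypothetical helper, not part of the notebook.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
base = rng.normal(size=200)
toy = pd.DataFrame({
    "a": base,
    "b": base * 2 + rng.normal(scale=0.01, size=200),  # nearly a scaled copy of "a"
    "c": rng.normal(size=200),                          # unrelated noise
})

def strongest_pairs(frame, top=5):
    """Rank distinct column pairs by absolute Pearson correlation."""
    corr = frame.corr()
    # Keep only the upper triangle so each pair appears once and the diagonal is dropped.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(top)

ranked = strongest_pairs(toy)
print(ranked)
```

On the notebook's data the same helper could be called with `eclipse_features`.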
In [27]:
# Geospatial Distribution of Eclipses
# Plotting the latitude and longitude to see the distribution of eclipse paths
plt.figure(figsize=(12, 6))
sns.scatterplot(data=data_final, x='Eclipse Longitude', y='Eclipse Latitude', hue='Eclipse Type', style='Eclipse Type')
plt.title('Geographic Distribution of Eclipses')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.grid(True)
plt.show()

# This visualization helps in understanding where most eclipses are visible from and highlights any geographical patterns or anomalies.
In [31]:
# Temporal Distribution of Eclipses
# Frequency of eclipses by year
eclipse_counts = data_final['Year'].value_counts().sort_index()
plt.figure(figsize=(12, 6))
eclipse_counts.plot(kind='line')
plt.title('Eclipse Frequency Over Time')
plt.xlabel('Year')
plt.ylabel('Number of Eclipses')
plt.grid(True)
plt.show()

# This plot will reveal any long-term trends or cyclic patterns in eclipse occurrences, which are important for understanding periodic astronomical phenomena.
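
Because any single year contains only a handful of solar eclipses, a per-year line over several millennia tends to look like a dense band. A hedged alternative is to count eclipses per century. The sketch below uses made-up years; in the notebook the input would be `data_final['Year']`.

```python
import pandas as pd

# Made-up years, including negative ones, to show how flooring behaves.
years = pd.Series([-1999, -1999, -1998, -1901, -1900, 5, 99, 100])

# Floor division rounds toward negative infinity, so -1999 lands in the -2000 bin.
century = (years // 100) * 100
per_century = century.value_counts().sort_index()
print(per_century)
```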

[Image: types-of-solar-eclipses-infographic.webp, an infographic of the types of solar eclipses]

In [32]:
displot = sns.displot(data_final, x="Eclipse Type", shrink=.8)

plt.title('Distribution of Eclipse Types')
plt.xlabel('Eclipse Type')
plt.ylabel('Count')

# Improving overall aesthetics with seaborn's despine function to remove top and right borders
sns.despine()
plt.show()
In [33]:
data2_filtrado = data_final[(data_final['Decade'] >= 2000) & (data_final['Decade'] <= 3000)]
In [34]:
fig = px.scatter_geo(data2_filtrado,
                     lat='Eclipse Latitude',
                     lon='Eclipse Longitude',
                    #  color='Eclipse Magnitude',  # Alternative variable for the color scale
                     color='EII',  # Variable mapped to the color scale
                     hover_name='Eclipse Magnitude',  # Extra information shown on hover
                     title='Eclipse Influence Map',
                    #  projection='natural earth',  # Map projection type
                     color_continuous_scale=px.colors.sequential.Plasma)  # Color scale
# Adjust the layout to center the title
fig.update_layout(
    title={
        'text': 'Eclipse Influence Map',
        'y':0.9,
        'x':0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    }
)
fig.show()
In [35]:
fig = px.density_mapbox(data2_filtrado,
                        lat='Eclipse Latitude',
                        lon='Eclipse Longitude',
                        z = 'EII',
                        radius = 8,
                        zoom = 2,
                        mapbox_style = 'open-street-map')
fig.show()
In [36]:
plt.figure(figsize=(14, 7))

lineplot = sns.lineplot(
    x='Decade',
    y='Eclipse Magnitude',
    data=data_final.sort_values('Decade'),
    marker='o',  # Adds markers to each data point
    linestyle='-',  # Solid line
    color='royalblue',  # Line color
    linewidth=1.5  # Line width
)

plt.title('Trend of Eclipse Magnitude Over Decades')
plt.xlabel('Decade')
plt.ylabel('Eclipse Magnitude')

plt.xticks(fontsize=12, rotation=45)
plt.yticks(fontsize=12)

# Highlighting specific points or trends if needed (e.g., highest magnitude)
plt.scatter(
    x=data_final.loc[data_final['Eclipse Magnitude'].idxmax(), 'Decade'],
    y=data_final['Eclipse Magnitude'].max(),
    color='red',
    s=50,  # Size of the scatter point
    label='Highest Magnitude',
    zorder=5  # Ensures the point is on top
)
plt.legend()

# Improving overall aesthetics with seaborn's despine function to remove top and right borders
sns.despine()

plt.show()

Queries¶

Query 1:¶


List the mean eclipse magnitude per decade

In [37]:
df_aggregated = data_final.groupby('Decade')['Eclipse Magnitude'].mean().reset_index()
In [38]:
df_aggregated.head()
Out[38]:
Decade Eclipse Magnitude
0 -2000 0.732985
1 -1990 0.777043
2 -1980 0.733739
3 -1970 0.735793
4 -1960 0.791861
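
A mean alone hides how many eclipses fall in each decade and how widely their magnitudes spread. As a sketch on toy values, named aggregation can return several statistics per decade in one pass. The output column names are arbitrary choices for illustration.

```python
import pandas as pd

# Toy rows; the magnitudes are made up.
toy = pd.DataFrame({
    "Decade": [-2000, -2000, -1990, -1990, -1990],
    "Eclipse Magnitude": [0.99, 0.87, 0.95, 0.91, 0.15],
})

summary = (
    toy.groupby("Decade")["Eclipse Magnitude"]
    .agg(mean="mean", count="count", minimum="min", maximum="max")
    .reset_index()
)
print(summary)
```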
In [39]:
# Creating the interactive line plot using Plotly Express
fig = px.line(
    df_aggregated,
    x='Decade',
    y='Eclipse Magnitude',
    title='Average Trend of Eclipse Magnitude Over Decades',
    labels={'Decade': 'Decade', 'Eclipse Magnitude': 'Average Eclipse Magnitude'}
)

fig.update_layout(
    title={'text': "Average Trend of Eclipse Magnitude Over Decades", 'y':0.95, 'x':0.5, 'xanchor': 'center', 'yanchor': 'top'},
    hovermode='x unified'
)

# Optimizing marker visibility for large datasets
fig.update_traces(
    line=dict(width=2, color='Blue'),
    marker=dict(size=4, color='LightSkyBlue', line=dict(width=1, color='DarkSlateGrey')),
    hovertemplate="Decade: %{x}<br>Avg. Eclipse Magnitude: %{y:.2f}<extra></extra>"
)

fig.show()

Query 2:¶


List the mean EII (Eclipse Influence Index) per decade

In [40]:
df_aggregated_2 = data_final.groupby('Decade')['EII'].mean().reset_index()
In [41]:
df_aggregated_2.head()
Out[41]:
Decade EII
0 -2000 0.529093
1 -1990 0.539257
2 -1980 0.509304
3 -1970 0.541058
4 -1960 0.524778
In [42]:
# Creating the interactive line plot using Plotly Express
fig = px.line(
    df_aggregated_2,
    x='Decade',
    y='EII',
    title='Average Trend of Influence Index Over Decades',
    labels={'Decade': 'Decade', 'EII': 'Average Influence Index'}
)

fig.update_layout(
    title={'text': "Average Trend of Influence Index Over Decades", 'y':0.95, 'x':0.5, 'xanchor': 'center', 'yanchor': 'top'},
    hovermode='x unified'
)

# Optimizing marker visibility for large datasets
fig.update_traces(
    line=dict(width=2, color='Green'),
    marker=dict(size=4, color='LightSkyBlue', line=dict(width=1, color='DarkSlateGrey')),
    hovertemplate="Decade: %{x}<br>Avg. Influence Index: %{y:.2f}<extra></extra>"
)

# Display the plot
fig.show()

Query 3.1:¶


How Gamma affects eclipse magnitude, by eclipse type

In [43]:
plt.figure(figsize=(10, 6))

sns.scatterplot(data=data_final, x='Gamma', y='Eclipse Magnitude', hue='Eclipse Type')

plt.title('Eclipse Magnitude vs. Gamma by Type')
plt.xlabel('Gamma')
plt.ylabel('Eclipse Magnitude')

plt.grid(True)
plt.show()

# This scatter plot helps to visualize how the eclipse magnitude is related to the gamma value, differentiated by the type of eclipse.

Query 3.2:¶


How Gamma affects eclipse magnitude, for each eclipse type separately

In [44]:
tipos_filtrados_t = data_final[data_final['Eclipse Type'].str.contains('T', na=False)]
tipos_filtrados_a = data_final[data_final['Eclipse Type'].str.contains('A', na=False)]
tipos_filtrados_h = data_final[data_final['Eclipse Type'].str.contains('H', na=False)]
tipos_filtrados_p = data_final[data_final['Eclipse Type'].str.contains('P', na=False)]
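
`str.contains` matches the letter anywhere in the label, so it also picks up any suffixed variants of a type. If the column does contain such variants, a possible alternative is to derive the base type from the first character once and group on it. The suffixed labels below are assumed sample values and should be checked against the real 'Eclipse Type' column.

```python
import pandas as pd

# Assumed sample labels: four base types plus suffixed variants.
types = pd.Series(["T", "A", "P", "H", "Tm", "A+", "Pb", "H3"])

base_type = types.str[0]            # first character as the base type
counts = base_type.value_counts()
only_total = types[base_type == "T"]

print(counts.sort_index())
print(only_total.tolist())
```

Matching on the first character keeps the four groups mutually exclusive, which a substring match does not guarantee.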
In [45]:
# Create a figure with subplots
# Adjust (nrows, ncols) to arrange the plots as needed
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10))  # Adjust the size as needed

# Scatter plot of the first type in the first subplot
sns.scatterplot(ax=axs[0, 0], data=tipos_filtrados_t, x='Gamma', y='Eclipse Magnitude', hue='Eclipse Type')
axs[0, 0].set_title('Eclipse Magnitude vs. Gamma by Type T')
axs[0, 0].set_xlabel('Gamma')
axs[0, 0].set_ylabel('Eclipse Magnitude')
axs[0, 0].grid(True)

sns.scatterplot(ax=axs[0, 1], data=tipos_filtrados_a, x='Gamma', y='Eclipse Magnitude', hue='Eclipse Type')
axs[0, 1].set_title('Eclipse Magnitude vs. Gamma by Type A')
axs[0, 1].set_xlabel('Gamma')
axs[0, 1].set_ylabel('Eclipse Magnitude')
axs[0, 1].grid(True)

sns.scatterplot(ax=axs[1, 0], data=tipos_filtrados_p, x='Gamma', y='Eclipse Magnitude', hue='Eclipse Type')
axs[1, 0].set_title('Eclipse Magnitude vs. Gamma by Type P')
axs[1, 0].set_xlabel('Gamma')
axs[1, 0].set_ylabel('Eclipse Magnitude')
axs[1, 0].grid(True)

sns.scatterplot(ax=axs[1, 1], data=tipos_filtrados_h, x='Gamma', y='Eclipse Magnitude', hue='Eclipse Type')
axs[1, 1].set_title('Eclipse Magnitude vs. Gamma by Type H')
axs[1, 1].set_xlabel('Gamma')
axs[1, 1].set_ylabel('Eclipse Magnitude')
axs[1, 1].grid(True)

# Adjust the layout so labels and titles do not overlap
plt.tight_layout()

plt.show()
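
The four subplot blocks above differ only in the type letter and the target axes, so they can be produced in a loop. The sketch below uses plain matplotlib and toy data rather than seaborn and `data_final`. The `Agg` backend line is only there so the sketch runs without a display and would normally be omitted in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption for running outside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Toy rows; the values are illustrative only.
toy = pd.DataFrame({
    "Eclipse Type": ["T", "A", "P", "H", "T", "A"],
    "Gamma": [-0.27, -0.23, -1.47, 0.10, 0.50, -0.90],
    "Eclipse Magnitude": [1.07, 0.94, 0.16, 1.00, 1.03, 0.98],
})

fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10))
# axs.flat walks the 2x2 grid row by row: T, A on top, P, H below.
for ax, letter in zip(axs.flat, ["T", "A", "P", "H"]):
    subset = toy[toy["Eclipse Type"].str.contains(letter, na=False)]
    ax.scatter(subset["Gamma"], subset["Eclipse Magnitude"], s=25)
    ax.set_title(f"Eclipse Magnitude vs. Gamma by Type {letter}")
    ax.set_xlabel("Gamma")
    ax.set_ylabel("Eclipse Magnitude")
    ax.grid(True)

fig.tight_layout()
titles = [ax.get_title() for ax in axs.flat]
print(titles)
```

The same loop could serve the later per-type figures by changing the column names passed to `scatter` and the labels.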

Query 4:¶


Show the Saros Number per decade for each eclipse type

In [46]:
# Create a figure with subplots
# Adjust (nrows, ncols) to arrange the plots as needed
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10))  # Adjust the size as needed

# Scatter plot of the first type in the first subplot
sns.scatterplot(ax=axs[0, 0], data=tipos_filtrados_t, x='Decade', y='Saros Number', hue='Eclipse Type')
axs[0, 0].set_title('Saros Number vs. Decade by Type T')
axs[0, 0].set_xlabel('Decade')
axs[0, 0].set_ylabel('Saros Number')
axs[0, 0].grid(True)

sns.scatterplot(ax=axs[0, 1], data=tipos_filtrados_a, x='Decade', y='Saros Number', hue='Eclipse Type')
axs[0, 1].set_title('Saros Number vs. Decade by Type A')
axs[0, 1].set_xlabel('Decade')
axs[0, 1].set_ylabel('Saros Number')
axs[0, 1].grid(True)

sns.scatterplot(ax=axs[1, 0], data=tipos_filtrados_p, x='Decade', y='Saros Number', hue='Eclipse Type')
axs[1, 0].set_title('Saros Number vs. Decade by Type P')
axs[1, 0].set_xlabel('Decade')
axs[1, 0].set_ylabel('Saros Number')
axs[1, 0].grid(True)

sns.scatterplot(ax=axs[1, 1], data=tipos_filtrados_h, x='Decade', y='Saros Number', hue='Eclipse Type')
axs[1, 1].set_title('Saros Number vs. Decade by Type H')
axs[1, 1].set_xlabel('Decade')
axs[1, 1].set_ylabel('Saros Number')
axs[1, 1].grid(True)

# Adjust the layout so labels and titles do not overlap
plt.tight_layout()

# Show the complete figure
plt.show()

Query 5:¶


Show the geographic distribution of eclipses for each eclipse type

In [47]:
# Create a figure with subplots
# Adjust (nrows, ncols) to arrange the plots as needed
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10))  # Adjust the size as needed

# Scatter plot of the first type in the first subplot
sns.scatterplot(ax=axs[0, 0], data=tipos_filtrados_t, x='Eclipse Longitude', y='Eclipse Latitude', hue='Eclipse Type')
axs[0, 0].set_title('Latitude vs. Longitude by Type T')
axs[0, 0].set_xlabel('Longitude')
axs[0, 0].set_ylabel('Latitude')
axs[0, 0].grid(True)

sns.scatterplot(ax=axs[0, 1], data=tipos_filtrados_a, x='Eclipse Longitude', y='Eclipse Latitude', hue='Eclipse Type')
axs[0, 1].set_title('Latitude vs. Longitude by Type A')
axs[0, 1].set_xlabel('Longitude')
axs[0, 1].set_ylabel('Latitude')
axs[0, 1].grid(True)

sns.scatterplot(ax=axs[1, 0], data=tipos_filtrados_p, x='Eclipse Longitude', y='Eclipse Latitude', hue='Eclipse Type')
axs[1, 0].set_title('Latitude vs. Longitude by Type P')
axs[1, 0].set_xlabel('Longitude')
axs[1, 0].set_ylabel('Latitude')
axs[1, 0].grid(True)

sns.scatterplot(ax=axs[1, 1], data=tipos_filtrados_h, x='Eclipse Longitude', y='Eclipse Latitude', hue='Eclipse Type')
axs[1, 1].set_title('Latitude vs. Longitude by Type H')
axs[1, 1].set_xlabel('Longitude')
axs[1, 1].set_ylabel('Latitude')
axs[1, 1].grid(True)

# Adjust the layout so labels and titles do not overlap
plt.tight_layout()

# Show the complete figure
plt.show()
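The four panels above differ only in the filtered frame and the type letter, so the same grid could be produced with a loop. The following is a minimal sketch under stated assumptions: `plot_panels` is a hypothetical helper (not part of the notebook), it uses plain `ax.scatter` rather than `sns.scatterplot`, and the toy frame stands in for `tipos_filtrados_t`, `tipos_filtrados_a`, `tipos_filtrados_p` and `tipos_filtrados_h` with made-up values.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_panels(frames, x, y, xlabel, ylabel):
    """Draw one scatter panel per eclipse type on a 2x2 grid.

    `frames` maps a type label (e.g. 'T') to its filtered DataFrame.
    """
    fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10))
    # axs.flat walks the grid row by row: [0, 0], [0, 1], [1, 0], [1, 1]
    for ax, (label, frame) in zip(axs.flat, frames.items()):
        ax.scatter(frame[x], frame[y], s=25)
        ax.set_title(f'{ylabel} vs. {xlabel} by Type {label}')
        ax.set_xlabel(xlabel)
        ax.set_ylabel(ylabel)
        ax.grid(True)
    fig.tight_layout()
    return fig, axs


# Toy stand-in for the filtered frames (values are made up for illustration).
toy = pd.DataFrame({'Eclipse Longitude': [10.0, -45.5],
                    'Eclipse Latitude': [6.0, -32.9]})
frames = {'T': toy, 'A': toy, 'P': toy, 'H': toy}
fig, axs = plot_panels(frames, 'Eclipse Longitude', 'Eclipse Latitude',
                       'Longitude', 'Latitude')
```

In the notebook, `frames` would be built from the real filtered frames, and `plt.show()` would follow the call as in the cells above.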

Query 6:¶


Show how HEAS (the eclipse's alignment score with historical significance) varies by Year for each eclipse type

In [48]:
# Create a figure with subplots
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10))

# Set the background color of the figure
fig.patch.set_facecolor('lightgray')  # Change 'lightgray' to the desired figure background color

# First subplot
sns.scatterplot(ax=axs[0, 0], data=tipos_filtrados_t, x='Year', y='HEAS', hue='Eclipse Type', s=25)
axs[0, 0].set_title('Year vs. HEAS by Type T')
axs[0, 0].set_xlabel('Year')
axs[0, 0].set_ylabel('HEAS')
axs[0, 0].grid(True)
axs[0, 0].set_facecolor('#f0f0f0')  # Change the background of the plot area

# Second subplot
sns.scatterplot(ax=axs[0, 1], data=tipos_filtrados_a, x='Year', y='HEAS', hue='Eclipse Type', s=25)
axs[0, 1].set_title('Year vs. HEAS by Type A')
axs[0, 1].set_xlabel('Year')
axs[0, 1].set_ylabel('HEAS')
axs[0, 1].grid(True)
axs[0, 1].set_facecolor('#f0f0f0')  # Change the background of the plot area

# Third subplot
sns.scatterplot(ax=axs[1, 0], data=tipos_filtrados_p, x='Year', y='HEAS', hue='Eclipse Type', s=25)
axs[1, 0].set_title('Year vs. HEAS by Type P')
axs[1, 0].set_xlabel('Year')
axs[1, 0].set_ylabel('HEAS')
axs[1, 0].grid(True)
axs[1, 0].set_facecolor('#f0f0f0')  # Change the background of the plot area

# Fourth subplot
sns.scatterplot(ax=axs[1, 1], data=tipos_filtrados_h, x='Year', y='HEAS', hue='Eclipse Type', s=25)
axs[1, 1].set_title('Year vs. HEAS by Type H')
axs[1, 1].set_xlabel('Year')
axs[1, 1].set_ylabel('HEAS')
axs[1, 1].grid(True)
axs[1, 1].set_facecolor('#f0f0f0')  # Change the background of the plot area

# Adjust the layout so labels and titles do not overlap
plt.tight_layout()

# Show the complete figure
plt.show()

Query 7:¶


Show the most common eclipse type by day and by night

In [49]:
sns.histplot(x = "Eclipse Type", hue = "Daytime/Nighttime", data = data_final, multiple = "dodge", shrink=0.8)
sns.despine()
plt.show()
In [50]:
# Group and count the occurrences of each eclipse type within each Daytime/Nighttime group
series = data_final.groupby(['Daytime/Nighttime','Eclipse Type'])['Eclipse Type'].count()
# Unstack the grouped Series into a DataFrame (one column per eclipse type) to prepare for idxmax
grouped_df = series.unstack(fill_value=0)
grouped_df
Out[50]:
Eclipse Type A A+ A- Am An As H H2 H3 Hm P Pb Pe T T+ T- Tm Tn Ts
Daytime/Nighttime
Daytime 2786 20 21 55 23 12 364 20 16 12 1936 87 77 2252 2 7 57 7 6
Nighttime 969 14 13 17 13 13 138 4 10 5 1939 76 85 797 7 10 15 7 6
In [51]:
# Find the eclipse type with the most occurrences for each Daytime/Nighttime group
most_frequent_type = grouped_df.idxmax(axis=1)
most_frequent_type
Out[51]:
Daytime/Nighttime
Daytime      A
Nighttime    P
dtype: object
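The group, count and unstack steps above can also be written as a single cross-tabulation. This is a minimal sketch on a toy frame with made-up rows (not the notebook's `data_final`), using `pd.crosstab` as an alternative to the `groupby`/`unstack` pair:

```python
import pandas as pd

# Toy stand-in for data_final (rows are made up for illustration).
toy = pd.DataFrame({
    'Daytime/Nighttime': ['Daytime', 'Daytime', 'Daytime',
                          'Nighttime', 'Nighttime', 'Nighttime'],
    'Eclipse Type': ['A', 'A', 'T', 'P', 'P', 'A'],
})

# Rows: Daytime/Nighttime; columns: eclipse type; cells: occurrence counts
# (combinations that never occur are filled with 0).
counts = pd.crosstab(toy['Daytime/Nighttime'], toy['Eclipse Type'])

# Column label with the highest count in each row.
most_frequent = counts.idxmax(axis=1)
print(most_frequent)
```

Note that `idxmax` returns only the first maximum, so a tie between two types would be resolved silently in favor of the column that comes first.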

Query 8:¶


Show how visible an eclipse is according to the Influence Index (EII), by eclipse type

In [52]:
# Visibility Score: how visible or extensive the eclipse is
# Create a figure with subplots
# Adjust (nrows, ncols) depending on how you want to arrange the plots
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10))  # Adjust the size as needed

# Scatter plot for the first type in the first subplot
sns.scatterplot(ax=axs[0, 0], data=tipos_filtrados_t, x='Visibility Score', y='EII', hue='Eclipse Type', s=25)
axs[0, 0].set_title('Visibility Score vs EII by Type T')
axs[0, 0].set_xlabel('Visibility Score')
axs[0, 0].set_ylabel('EII')
axs[0, 0].grid(True)

sns.scatterplot(ax=axs[0, 1], data=tipos_filtrados_a, x='Visibility Score', y='EII', hue='Eclipse Type', s=25)
axs[0, 1].set_title('Visibility Score vs EII by Type A')
axs[0, 1].set_xlabel('Visibility Score')
axs[0, 1].set_ylabel('EII')
axs[0, 1].grid(True)

sns.scatterplot(ax=axs[1, 0], data=tipos_filtrados_p, x='Visibility Score', y='EII', hue='Eclipse Type', s=25)
axs[1, 0].set_title('Visibility Score vs EII by Type P')
axs[1, 0].set_xlabel('Visibility Score')
axs[1, 0].set_ylabel('EII')
axs[1, 0].grid(True)

sns.scatterplot(ax=axs[1, 1], data=tipos_filtrados_h, x='Visibility Score', y='EII', hue='Eclipse Type', s=25)
axs[1, 1].set_title('Visibility Score vs EII by Type H')
axs[1, 1].set_xlabel('Visibility Score')
axs[1, 1].set_ylabel('EII')
axs[1, 1].grid(True)

# Adjust the layout so labels and titles do not overlap
plt.tight_layout()

# Show the complete figure
plt.show()

Query 9:¶


Show the most common eclipse type by hemisphere

In [53]:
plt.figure(figsize=(10, 6))
sns.histplot(x = "Eclipse Type", hue = "Geographical Hemisphere", data = data_final, multiple = "dodge", shrink=0.8)
sns.despine()
plt.show()
In [54]:
# Group and count the occurrences of each eclipse type within each Geographical Hemisphere group
series = data_final.groupby(['Geographical Hemisphere','Eclipse Type'])['Eclipse Type'].count()
# Unstack the grouped Series into a DataFrame (one column per eclipse type) to prepare for idxmax
grouped_df = series.unstack(fill_value=0)
# Find the eclipse type with the most occurrences for each Geographical Hemisphere
most_frequent_type = grouped_df.idxmax(axis=1)
most_frequent_type
Out[54]:
Geographical Hemisphere
N E    P
N W    P
S E    A
S W    P
dtype: object
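The counts above treat each subtype (`A+`, `Am`, `Pb`, ...) as its own category, whereas the per-type panels in this notebook group rows by the letter T, A, P or H. If the ranking should follow the same grouping, the subtypes could be collapsed to their first letter before counting. A minimal sketch, assuming the first character of `Eclipse Type` is always the base type; the `Base Type` column and the toy rows are hypothetical:

```python
import pandas as pd

# Toy stand-in for data_final (rows are made up for illustration).
toy = pd.DataFrame({
    'Geographical Hemisphere': ['N E', 'N E', 'N E', 'S W', 'S W'],
    'Eclipse Type': ['A', 'Am', 'P', 'T', 'T+'],
})

# Collapse subtypes such as 'Am' or 'T+' to their base letter.
toy['Base Type'] = toy['Eclipse Type'].str[0]

# Count base types per hemisphere and pick the most frequent one.
base_counts = pd.crosstab(toy['Geographical Hemisphere'], toy['Base Type'])
most_frequent_base = base_counts.idxmax(axis=1)
print(most_frequent_base)
```

With the real data, the result may differ from the subtype-level ranking when one base type is split across many subtypes.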

Query 10:¶


Show how the value of gamma changes with the Saros series number

In [55]:
data_final['Saros Number'].value_counts()
Out[55]:
Saros Number
 34     86
 52     86
 51     85
 32     84
 53     84
        ..
 188     7
 189     5
-12      4
-13      2
 190     1
Name: count, Length: 204, dtype: int64
In [59]:
lineplot = sns.lineplot(
    y='Gamma',
    x='Saros Number',
    data=data_final.sort_values('Saros Number'),
    marker='o',  # Adds markers to each data point
    linestyle='-',  # Solid line
    color='royalblue',  # Line color
    linewidth=1.5  # Line width
)
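Each Saros number occurs many times in the data (up to 86 rows in the counts above). When `x` values repeat, `sns.lineplot` by default draws the mean of `y` for each `x` plus a confidence band, so the line shows an average Gamma per series rather than individual eclipses. To inspect the spread explicitly, the values could be summarized per series first. A minimal sketch on a toy frame with made-up values:

```python
import pandas as pd

# Toy stand-in for data_final (values are made up for illustration).
toy = pd.DataFrame({
    'Saros Number': [34, 34, 34, 52, 52],
    'Gamma': [-0.5, 0.0, 0.5, 1.0, 0.2],
})

# One row per Saros series with the range, mean and number of eclipses.
gamma_by_saros = (
    toy.groupby('Saros Number')['Gamma']
       .agg(['min', 'mean', 'max', 'count'])
       .reset_index()
)
print(gamma_by_saros)
```

Plotting `min` and `max` alongside `mean` would show how much Gamma varies within a single series.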

Query 11:¶


Visualize the relationship between Gamma and the latitude of an eclipse

In [58]:
scatter = sns.scatterplot(
    x='Eclipse Latitude',
    y='Gamma',
    data=data_final,
    hue='Eclipse Type'
)
# Move the legend to the side of the plot: anchor its upper-left corner just outside the axes
plt.legend(loc='upper left', bbox_to_anchor=(1.05, 1), borderaxespad=0)
plt.show()
In [60]:
tipos_filtrados_t = data_final[data_final['Eclipse Type'].str.contains('T', na=False)]
tipos_filtrados_a = data_final[data_final['Eclipse Type'].str.contains('A', na=False)]
tipos_filtrados_h = data_final[data_final['Eclipse Type'].str.contains('H', na=False)]
tipos_filtrados_p = data_final[data_final['Eclipse Type'].str.contains('P', na=False)]
In [61]:
# Create a figure with subplots
# Adjust (nrows, ncols) depending on how you want to arrange the plots
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10))  # Adjust the size as needed

# Scatter plot for the first type in the first subplot
sns.scatterplot(ax=axs[0, 0], data=tipos_filtrados_t, x='Eclipse Latitude', y='Gamma', hue='Eclipse Type')
axs[0, 0].set_title('Gamma vs Eclipse Latitude by Type T')
axs[0, 0].set_xlabel('Eclipse Latitude')  # x holds the latitude
axs[0, 0].set_ylabel('Gamma')  # y holds gamma
axs[0, 0].grid(True)

sns.scatterplot(ax=axs[0, 1], data=tipos_filtrados_a, x='Eclipse Latitude', y='Gamma', hue='Eclipse Type')
axs[0, 1].set_title('Gamma vs Eclipse Latitude by Type A')
axs[0, 1].set_xlabel('Eclipse Latitude')  # x holds the latitude
axs[0, 1].set_ylabel('Gamma')  # y holds gamma
axs[0, 1].grid(True)

sns.scatterplot(ax=axs[1, 0], data=tipos_filtrados_p, x='Eclipse Latitude', y='Gamma', hue='Eclipse Type')
axs[1, 0].set_title('Gamma vs Eclipse Latitude by Type P')
axs[1, 0].set_xlabel('Eclipse Latitude')  # x holds the latitude
axs[1, 0].set_ylabel('Gamma')  # y holds gamma
axs[1, 0].grid(True)

sns.scatterplot(ax=axs[1, 1], data=tipos_filtrados_h, x='Eclipse Latitude', y='Gamma', hue='Eclipse Type')
axs[1, 1].set_title('Gamma vs Eclipse Latitude by Type H')
axs[1, 1].set_xlabel('Eclipse Latitude')  # x holds the latitude
axs[1, 1].set_ylabel('Gamma')  # y holds gamma
axs[1, 1].grid(True)

# Adjust the layout so labels and titles do not overlap
plt.tight_layout()

# Show the complete figure
plt.show()

Query 12:¶


Visualize the relationship between Sun Altitude and Visibility Score

In [62]:
lineplot = sns.lineplot(
    x='Sun Altitude',
    y='Visibility Score',
    data=data_final[data_final['Sun Altitude'] != 0],
    marker='o',  # Adds markers to each data point
    linestyle='-',  # Solid line
    color='royalblue',  # Line color
    linewidth=1.5  # Line width
)

Query 13:¶


Distribution of eclipses by month

In [63]:
data_month_grouped = pd.DataFrame(data_final.groupby('Month')['Eclipse Type'].count().reset_index(name='Count'))
In [64]:
sns.set_style("whitegrid")
plt.figure(figsize=(6,6))
plt.pie(data_month_grouped['Count'], labels=data_month_grouped['Month'], autopct='%.2f%%')
plt.title('Distribution of eclipses by month')
plt.show()
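`groupby` sorts its keys, so if `Month` holds month names the slices come out in alphabetical order (April, August, ...) rather than calendar order. The following is a minimal sketch of one way to restore calendar order, assuming `Month` contains full English month names (this is not confirmed by the cells shown here); the toy rows are made up.

```python
import pandas as pd

# Calendar order, written out explicitly so it does not depend on the locale.
month_order = ['January', 'February', 'March', 'April', 'May', 'June',
               'July', 'August', 'September', 'October', 'November', 'December']

# Toy stand-in for data_final (rows are made up for illustration).
toy = pd.DataFrame({'Month': ['June', 'December', 'June', 'April', 'January']})

# An ordered categorical makes groupby sort by calendar position.
toy['Month'] = pd.Categorical(toy['Month'], categories=month_order, ordered=True)

month_counts = (
    toy.groupby('Month', observed=True)  # observed=True drops months with no rows
       .size()
       .reset_index(name='Count')
)
print(month_counts)
```

The resulting frame has the same `Month` and `Count` columns as `data_month_grouped`, so it could be passed to `plt.pie` in the same way.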

Query 14:¶


Change in eclipse magnitude with the Sun's altitude

In [65]:
# Creating the line plot with a more appealing aesthetic
lineplot = sns.lineplot(
    x='Sun Altitude',
    y='Eclipse Magnitude',
    data=data_final[data_final['Sun Altitude'] != 0],
    marker='o',  # Adds markers to each data point
    linestyle='-',  # Solid line
    color='royalblue',  # Line color
    linewidth=1.5  # Line width
)
In [66]:
df_aggregated = data_final[data_final['Sun Altitude'] != 0].groupby('Sun Altitude')['Eclipse Magnitude'].mean().reset_index()
In [67]:
# Creating the interactive line plot using Plotly Express
fig = px.line(
    df_aggregated,
    x='Sun Altitude',
    y='Eclipse Magnitude',
    title='Average Trend of Eclipse Magnitude Over Sun Altitude',
    labels={'Sun Altitude': 'Sun Altitude', 'Eclipse Magnitude': 'Average Eclipse Magnitude'}
)

fig.update_layout(
    title={'text': "Average Trend of Eclipse Magnitude Over Sun Altitude", 'y':0.95, 'x':0.5, 'xanchor': 'center', 'yanchor': 'top'},
    hovermode='x unified'
)

# Optimizing marker visibility for large datasets
fig.update_traces(
    line=dict(width=2, color='Orange'),
    marker=dict(size=4, color='LightSkyBlue', line=dict(width=1, color='DarkSlateGrey')),
    hovertemplate="Sun Altitude: %{x}<br>Avg. Eclipse Magnitude: %{y:.2f}<extra></extra>"
)

# Display the plot
fig.show()

Query 15:¶


Relationship between the magnitude of an eclipse and the month

In [68]:
lineplot = sns.lineplot(
    x='Month',
    y='Eclipse Magnitude',
    data=data_final,
    marker='o',  # Adds markers to each data point
    linestyle='-',  # Solid line
    color='orange',  # Line color
    linewidth=1.5  # Line width
)

plt.xticks(fontsize=12, rotation=90)
plt.yticks(fontsize=12)

plt.show()

Convert ipynb to HTML¶

In [69]:
!pip -q install nbconvert
In [70]:
!ls /content
data.csv  eclipse_data_enriched_5000_years.csv	sample_data
In [ ]:
!jupyter nbconvert --to html /content/Eclipse_limpieza_datos.ipynb
[NbConvertApp] WARNING | pattern '/content/Eclipse_limpieza_datos.ipynb' matched no files
To see all available configurables, use `--help-all`.
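The warning above means that nbconvert found no file at the given path, and the earlier `!ls /content` listing shows no `.ipynb` file there either. Before converting, it can help to check where the notebook actually lives. A minimal sketch, assuming the notebook has been uploaded or is reachable somewhere under `/content`; `find_notebooks` is a hypothetical helper, not part of nbconvert:

```python
from pathlib import Path


def find_notebooks(root, name_pattern='*.ipynb'):
    """Return notebook paths under `root`, skipping checkpoint copies."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob(name_pattern)
        if '.ipynb_checkpoints' not in p.parts
    )


matches = find_notebooks('/content')
if matches:
    print('Candidates:', *matches, sep='\n  ')
else:
    print('No .ipynb file under /content; upload the notebook or mount '
          'the storage that holds it before running nbconvert.')
```

Once a path is found, it can be passed to the same command used above, `!jupyter nbconvert --to html <path>`.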